Early Stopping and Non-parametric Regression: An Optimal Data-dependent Stopping Rule

Authors

  • Garvesh Raskutti
  • Martin J. Wainwright
  • Bin Yu
  • Sara van de Geer
Abstract

Early stopping is a form of regularization based on choosing when to stop running an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy for a form of gradient descent applied to the least-squares loss function. We propose a data-dependent stopping rule that does not involve hold-out or cross-validation data, and we prove upper bounds on the squared error of the resulting function estimate, measured in either the L^2(P) or the L^2(P_n) norm. These upper bounds lead to minimax-optimal rates for various kernel classes, including Sobolev smoothness classes and other forms of reproducing kernel Hilbert spaces. We show through simulation that our stopping rule compares favorably to two other stopping rules, one based on hold-out data and the other based on Stein's unbiased risk estimate. We also establish a tight connection between our early stopping strategy and the solution path of a kernel ridge regression estimator.
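The setting the abstract describes can be illustrated with a minimal sketch: gradient descent on the empirical least-squares loss over an RKHS, where the fitted function is a kernel expansion over the training points. The sketch below assumes an RBF kernel and, for simplicity, uses a hold-out stopping rule (one of the baselines the abstract compares against) rather than the paper's data-dependent rule; all names and parameters here are illustrative.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||x_i - z_j||^2)
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d)

def early_stopped_kernel_gd(X, y, X_val, y_val,
                            step=0.5, max_iter=500, patience=10):
    """Gradient descent on the empirical least-squares loss in an RKHS.

    The iterate is f_t(x) = sum_i alpha_i k(x_i, x); the functional
    gradient step reduces to alpha <- alpha - (step / n) * (K alpha - y).
    Stopping here is by hold-out error (a simple baseline rule), not the
    paper's data-dependent rule.
    """
    n = len(y)
    K = rbf_kernel(X, X)          # train-train Gram matrix
    K_val = rbf_kernel(X_val, X)  # validation-train cross-kernel
    alpha = np.zeros(n)
    best_err, best_alpha, bad = np.inf, alpha.copy(), 0
    for _ in range(max_iter):
        grad = (K @ alpha - y) / n        # gradient of (1/2n)||K a - y||-type loss
        alpha = alpha - step * grad
        err = np.mean((K_val @ alpha - y_val) ** 2)
        if err < best_err:
            best_err, best_alpha, bad = err, alpha.copy(), 0
        else:
            bad += 1
            if bad >= patience:           # hold-out error stopped improving
                break
    return best_alpha, best_err
```

On a noisy one-dimensional regression problem, the validation error typically falls well below the error of the zero initializer before the rule fires, which is the regularizing effect early stopping is meant to capture.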


Related articles

Learning for non-stationary Dirichlet processes

The Dirichlet process prior (DPP) is used to model an unknown probability distribution, F. This eliminates the need for parametric model assumptions, providing robustness in problems where there is significant model uncertainty. Two important parametric techniques for learning are extended to this non-parametric context for the first time. These are (i) sequential stopping, which proposes an o...


How many samples are needed to build a classifier: a general sequential approach

MOTIVATION The standard paradigm for classifier design is to obtain a sample of feature-label pairs and then apply a classification rule to derive a classifier from the sample data. Typically, in laboratory situations the sample size is limited by cost, time, or availability of sample material. Thus, an investigator may wish to consider a sequential approach in which there is a sufficient nu...


A new strategy for Robbins' problem of optimal stopping

In this article we study the expected rank problem under full information. Our approach uses the planar Poisson approach from Gnedin (2007) to derive the expected rank of a stopping rule that is one of the simplest non-trivial examples combining rank-dependent rules with threshold rules. This rule attains an expected rank lower than the best upper bounds obtained in the literature so far, in pa...


Kernel Partial Least Squares is Universally Consistent

We prove the statistical consistency of kernel Partial Least Squares Regression applied to a bounded regression learning problem on a reproducing kernel Hilbert space. Partial Least Squares stands apart from well-known classical approaches such as Ridge Regression or Principal Components Regression, as it is not defined as the solution of a global cost minimization procedure over a fixed model nor ...


A Comparison of 2-CUSUM Stopping Rules for Quickest Detection of Two-sided Alternatives in a Brownian Motion Model

This work compares the performance of all existing 2-CUSUM stopping rules used in the problem of sequential detection of a change in the drift of a Brownian motion in the case of two-sided alternatives. As a performance measure an extended Lorden criterion is used. According to this criterion the optimal stopping rule is an equalizer rule. This paper compares the performance of the modified dri...



Publication date: 2014